NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

A NUMA-Aware Version of an Adaptive Self-Scheduling Loop Scheduler

https://doi.org/10.1145/3680549

Booth, Joshua Dennis; Lane, Phillip (December 2024, ACM Transactions on Architecture and Code Optimization)

Parallelizing code in a shared-memory environment is commonly done utilizing loop scheduling (LS) in a fork-join manner as in OpenMP. This manner of parallelization is popular due to its ease to code, but the choice of the LS method is important when the workload per iteration is highly variable. Currently, the shared-memory environment is evolving in high-performance computing as larger chiplet-based processors with high core counts and segmented L3 cache are introduced. These processors have a stronger non-uniform memory access (NUMA) effect than the previous generation of x86-64 processors. This work attempts to modify the adaptive self-scheduling loop scheduler known asiCh(irregularChunk) for these NUMA environments while analyzing the impact of these systems on default OpenMP LS methods. In particular,iChis as a default LS method for irregular applications (i.e., applications where the workload per iteration is highly variable) that guarantees “good” performance without tuning. The modified version, namedNiCh, is demonstrated over multiple irregular applications to show the variation in performance. The work demonstrates thatNiChis able to better handle architectures with stronger NUMA effects, and particularly is better thaniChwhen the number of threads is greater than the number of cores. However,NiChalso comes with being less universally “good” thaniChand a set of parameters that are hardware dependent.
more » « less
Full Text Available
Neural acceleration of incomplete factorization preconditioning

https://doi.org/10.1007/s00521-024-10392-y

Booth, Joshua Dennis; Sun, Hongyang; Garnett, Trevor (January 2025, Neural Computing and Applications)

Full Text Available
To Protect or Not To Protect: Probability-Aware Selective Protection for Sparse Iterative Solvers

https://doi.org/10.1109/SBAC-PAD63648.2024.00028

Johnson, Daniel Ryley; Sun, Hongyang; Booth, Joshua Dennis; Raghavan, Padma (November 2024, IEEE)

Full Text Available
Dynamic Selective Protection of Sparse Iterative Solvers via ML Prediction of Soft Error Impacts

https://doi.org/10.1145/3624062.3624117

Chen, Zizhao; Verrecchia, Thomas; Sun, Hongyang; Booth, Joshua; Raghavan, Padma (November 2023, ACM)
Neural Acceleration of Graph Based Utility Functions for Sparse Matrices

https://doi.org/10.1109/ACCESS.2023.3262453

Booth, Joshua Dennis; Bolet, Gregory S. (January 2023, IEEE Access)

Search for: All records